Running System Commands in Python

Python allows interaction with the operating system by executing system commands directly from scripts using the subprocess module.

The subprocess Module

The subprocess module enables you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.

The subprocess.run Function

  • Purpose: Execute a command, wait for it to complete, and get the result.
  • Returns: A CompletedProcess instance containing details about the executed command.

Example:

import subprocess

result = subprocess.run(["date"])
  • The command is specified as a list, where the first element is the command and the subsequent elements are its arguments.
  • In this example, the date command displays the current date and time.
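
As a minimal sketch of what the returned CompletedProcess holds, the snippet below inspects its args and returncode attributes. It runs a child Python interpreter through sys.executable instead of date; that substitution is an assumption made only so the sketch runs on any platform.

```python
import subprocess
import sys

# Run a tiny child Python program instead of `date` so the sketch is portable.
command = [sys.executable, "-c", "print('hello from the child')"]
result = subprocess.run(command)

# CompletedProcess records the command that ran and its exit status.
print("Args:", result.args)
print("Return code:", result.returncode)
```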

Blocking Behavior

  • The parent process (your Python script) is blocked while the child process (the system command) is running.
  • The script resumes execution only after the child process completes.

Example with sleep:

import subprocess

subprocess.run(["sleep", "2"])
  • This command causes the script to pause for 2 seconds.
  • During this time, the script is blocked and cannot perform other tasks.
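
One way to observe the blocking behavior is to time the call. The sketch below assumes a child Python process that sleeps for half a second (standing in for the sleep command) and measures how long run() takes to return.

```python
import subprocess
import sys
import time

start = time.monotonic()
# The child sleeps for half a second; run() does not return until it exits.
subprocess.run([sys.executable, "-c", "import time; time.sleep(0.5)"])
elapsed = time.monotonic() - start

print(f"run() returned after {elapsed:.2f} seconds")
```

The measured time includes interpreter startup, so it should come out slightly above the sleep duration.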

Handling Command Return Codes

  • The CompletedProcess object has a returncode attribute.
  • A returncode of 0 indicates successful execution.
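  • A non-zero returncode usually indicates that an error occurred; its exact meaning depends on the command. On POSIX systems, a negative value -N means the child was terminated by signal N.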

Example:

import subprocess

result = subprocess.run(["ls", "non_existent_file"])
print("Return code:", result.returncode)
  • Since the file does not exist, ls returns a non-zero exit status.
  • You can use the returncode to handle errors in your script.
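
A possible way to branch on the return code is sketched below. The child process here is a hypothetical stand-in that exits with status 3, used so the outcome does not depend on which files exist.

```python
import subprocess
import sys

# The child exits with status 3 to simulate a failing command.
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

if result.returncode == 0:
    status = "succeeded"
else:
    status = f"failed with exit status {result.returncode}"

print("Command", status)
```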

Executing Commands with Arguments

  • Additional command-line arguments are included in the list after the command.

Example:

import subprocess

subprocess.run(["ls", "-l", "/usr"])
  • This runs ls with the -l option on the /usr directory.
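
If a command is only available as a single string, the standard-library shlex.split() function can turn it into the list form that subprocess.run() expects. This is a small sketch; the file name notes.txt is hypothetical.

```python
import shlex

# Split a command line into the list form, following POSIX shell quoting rules.
command_line = "ls -l /usr"
args = shlex.split(command_line)
print(args)

# Quoted text stays together as a single argument.
quoted = shlex.split('grep "hello world" notes.txt')
print(quoted)
```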

Obtaining the Output of a System Command

To process the output of a system command within your Python script, capture it using the capture_output parameter.

Capturing Standard Output and Standard Error

  • Set capture_output=True in subprocess.run() (available since Python 3.7) to capture the command's output.
  • The stdout and stderr attributes of the CompletedProcess object contain the captured output.

Example:

import subprocess

result = subprocess.run(["host", "8.8.8.8"], capture_output=True)
  • The host command resolves hostnames to IP addresses and vice versa.
  • By capturing the output, you can parse and manipulate the data.
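
For comparison, capture_output=True behaves like passing subprocess.PIPE for both stdout and stderr. The sketch below shows the two forms side by side, using a child Python process as an assumed stand-in for host so that no network access is needed.

```python
import subprocess
import sys

child = [sys.executable, "-c", "print('captured')"]

# capture_output=True is shorthand for piping both streams.
short_form = subprocess.run(child, capture_output=True)
long_form = subprocess.run(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print(short_form.stdout)
print(long_form.stdout)
```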

Accessing and Decoding the Output

  • The stdout and stderr attributes are byte strings (bytes objects).
  • To convert them to standard Python strings, decode them using decode().

Example:

output = result.stdout.decode()
print("Output:", output)
  • Decoding uses UTF-8 encoding by default.
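
Decoding can fail when the bytes are not valid UTF-8. As a rough illustration with made-up byte values, the errors argument of decode() controls what happens in that case:

```python
raw = b"caf\xc3\xa9"   # UTF-8 bytes for "café"
text = raw.decode()    # same as raw.decode("utf-8")

broken = b"ok \xff"    # \xff is not valid UTF-8
# errors="replace" substitutes U+FFFD instead of raising UnicodeDecodeError.
lenient = broken.decode(errors="replace")

print(text)
print(lenient)
```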

Parsing the Output

  • Once decoded, you can split or parse the output as needed.

Example:

output = result.stdout.decode()
output_parts = output.split()
print("Parsed Output:", output_parts)
  • This splits the output string into a list of words.

Extracting Specific Information

Extracting the Hostname from an IP Address:

import subprocess

result = subprocess.run(["host", "8.8.8.8"], capture_output=True)
output = result.stdout.decode().split()
hostname = output[-1].strip('.')
print("Hostname:", hostname)
  • Takes the last whitespace-separated word of the output, which is the hostname associated with the IP address when the lookup succeeds, and strips its trailing dot.
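
Taking the last word only works when the lookup succeeds. A slightly more defensive parser might look for the "domain name pointer" phrase first. The helper name extract_hostname and the sample text are assumptions for illustration; real host output can vary between versions.

```python
def extract_hostname(host_output):
    """Return the hostname from reverse-lookup output, or None if absent."""
    marker = "domain name pointer"
    for line in host_output.splitlines():
        if marker in line:
            # The hostname follows the marker and ends with a trailing dot.
            return line.split(marker, 1)[1].strip().rstrip(".")
    return None


# Assumed sample of what `host 8.8.8.8` might print.
sample = "8.8.8.8.in-addr.arpa domain name pointer dns.google.\n"
print(extract_hostname(sample))
print(extract_hostname("lookup failed\n"))
```

Returning None lets the caller distinguish a failed lookup from a real hostname.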

Handling Standard Error

  • If a command writes output to standard error, it is captured in the stderr attribute.

Example:

import subprocess

result = subprocess.run(["rm", "does_not_exist"], capture_output=True)
error_output = result.stderr.decode()
print("Error Output:", error_output)
  • Since the file does not exist, rm outputs an error message to standard error.
  • Capturing stderr allows you to handle errors gracefully.
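
In practice, stderr is usually read together with the return code. The following sketch assumes a child Python process that writes a made-up error message and exits with status 1, standing in for the failing rm call:

```python
import subprocess
import sys

# The child writes a message to stderr and exits with status 1.
child_code = "import sys; sys.stderr.write('cannot remove file\\n'); sys.exit(1)"
result = subprocess.run([sys.executable, "-c", child_code], capture_output=True)

if result.returncode != 0:
    error_output = result.stderr.decode().strip()
    print(f"Command failed ({result.returncode}): {error_output}")
```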

Understanding Byte Strings and Encoding

When you capture output from a subprocess, the data is returned as byte strings (bytes objects), which Python displays with a leading b (e.g., b'output').

Why Byte Strings?

  • Subprocesses communicate through byte streams, not Python strings.
  • This allows for binary data and text in various encodings to be transmitted.

Decoding Byte Strings

  • Use the decode() method to convert a byte string to a Python string.
  • By default, decode() uses 'utf-8' encoding, which is standard for Unicode text.

Example:

byte_output = result.stdout
string_output = byte_output.decode('utf-8')

Specifying Encodings

  • If the subprocess outputs data in a different encoding, specify it in decode().
  • Alternatively, use the text=True parameter in subprocess.run() to decode outputs automatically with the locale's preferred encoding, or pass the encoding parameter to name a specific encoding.

Example with text=True:

result = subprocess.run(["host", "8.8.8.8"], capture_output=True, text=True)
print(result.stdout)
  • When text=True, stdout and stderr are returned as strings, not bytes.
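
When the child's encoding is known and differs from the default, the encoding parameter of subprocess.run() can name it directly. The sketch below assumes a child that emits a single Latin-1 encoded line:

```python
import subprocess
import sys

# The child emits the Latin-1 byte 0xE9 ("é"), which is not valid UTF-8 on its own.
child_code = "import sys; sys.stdout.buffer.write(b'caf\\xe9\\n')"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    encoding="latin-1",  # decode stdout and stderr with this codec
)

print(result.stdout)
```

Passing encoding also switches the streams to text mode, so text=True is not needed in addition.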

Advanced Subprocess Management

The subprocess module provides additional parameters for more control over process execution.

Modifying Environment Variables

You can modify the environment variables for the subprocess using the env parameter.

Copying and Modifying the Environment

  • Use os.environ.copy() to get a copy of the current environment.
  • Modify the environment variables as needed.
  • Pass the modified environment to subprocess.run() via the env parameter.

Example:

import os
import subprocess

# Copy the current environment
my_env = os.environ.copy()

# Modify the PATH environment variable
my_env["PATH"] = os.pathsep.join(["/opt/myapp/", my_env["PATH"]])

# Run the command with the modified environment
result = subprocess.run(["myapp"], env=my_env)
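  • Prepends /opt/myapp/ to the PATH seen by the subprocess. On POSIX systems the modified PATH is also used to locate myapp; on Windows, env does not override the PATH used to find the executable, so a full path is more reliable there.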
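
The same pattern works for any variable. As a small sketch, the hypothetical variable GREETING below is set only in the copied environment and then read back by a child Python process:

```python
import os
import subprocess
import sys

# Work on a copy so the parent's own environment stays unchanged.
my_env = os.environ.copy()
my_env["GREETING"] = "hello"  # hypothetical variable for illustration

child_code = "import os; print(os.environ['GREETING'])"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=my_env,
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```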

Changing the Working Directory

Set the cwd parameter to specify the working directory for the subprocess.

Example:

import subprocess

# Run 'ls' in the '/usr' directory
subprocess.run(["ls"], cwd="/usr")
  • The command is executed as if the current directory is /usr.
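
To confirm that cwd affects only the child, the sketch below runs a child Python process in a temporary directory (an assumption chosen for portability) and compares the directory it reports with the one that was requested:

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    child_cwd = result.stdout.strip()
    # samefile() tolerates symlinked temp paths, e.g. on macOS.
    same_dir = os.path.samefile(child_cwd, workdir)

print("Child ran in:", child_cwd)
print("Parent is still in:", os.getcwd())
```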

Setting a Timeout for the Process

Use the timeout parameter to specify a maximum execution time for the subprocess.

Example:

import subprocess

try:
    # Attempt to sleep for 10 seconds, but time out after 5 seconds
    subprocess.run(["sleep", "10"], timeout=5)
except subprocess.TimeoutExpired:
    print("The command timed out.")
  • If the command exceeds the specified timeout, a TimeoutExpired exception is raised.
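
The TimeoutExpired exception carries details such as the timeout that was exceeded and the command that was running. A shortened variant of the example above, assuming a child Python process in place of sleep, might look like this:

```python
import subprocess
import sys

timed_out = False
try:
    # The child would sleep for 5 seconds, but we allow only half a second.
    subprocess.run([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5)
except subprocess.TimeoutExpired as exc:
    timed_out = True
    # exc.timeout holds the limit that was exceeded; exc.cmd holds the command.
    print(f"Gave up after {exc.timeout} seconds")
```

When the timeout expires, subprocess.run() kills the child and waits for it before raising the exception.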

Executing Commands via the Shell

Set shell=True to execute the command through the shell.

Example:

import subprocess

# Using shell=True to expand shell variables
subprocess.run("echo $HOME", shell=True)
  • Allows the use of shell features like variable expansion and wildcard patterns (globs).

Security Warning:

  • Using shell=True can be a security hazard, especially if you're constructing the command string from user input.
  • It can introduce shell injection vulnerabilities.
  • If you must use shell=True, always validate and sanitize any user input, for example by quoting it with shlex.quote() for POSIX shells.
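
One way to sidestep the problem is to keep the default shell=False and pass untrusted values as list elements. In this sketch the value user_input is a hypothetical piece of hostile input; because no shell is involved, it reaches the child as one literal argument.

```python
import subprocess
import sys

# Hypothetical untrusted value containing shell metacharacters.
user_input = "notes.txt; echo injected"

# Passed as a list element, the value is never interpreted by a shell.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```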

Additional Parameters

Input to the Subprocess

  • Use the input parameter to send data to the subprocess's standard input.

Example:

import subprocess

# Send input to a command
result = subprocess.run(
    ["grep", "hello"],
    input="hello world\nhello python",
    text=True,
    capture_output=True,
)
print(result.stdout)
  • The text=True parameter tells subprocess to handle inputs and outputs as strings rather than bytes.
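
Without text=True, the input argument must be a bytes object and the captured output is bytes as well. A minimal sketch, assuming a child Python process that upper-cases its standard input:

```python
import subprocess
import sys

# The child upper-cases whatever arrives on standard input.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    input=b"hello world",  # bytes, because text=True is not set
    capture_output=True,
)
print(result.stdout)
```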

Check for Errors Automatically

  • Use check=True to automatically raise an exception if the subprocess exits with a non-zero status.

Example:

import subprocess

try:
    # 'false' always exits with a non-zero status
    subprocess.run(["false"], check=True)
except subprocess.CalledProcessError as e:
    print(f"Command failed with return code {e.returncode}")
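
When output is captured, the CalledProcessError instance also exposes it through its stdout and stderr attributes. The following sketch assumes a child Python process that prints a made-up error message and exits with status 2:

```python
import subprocess
import sys

child_code = "import sys; sys.stderr.write('disk full\\n'); sys.exit(2)"
failure = None
try:
    subprocess.run(
        [sys.executable, "-c", child_code],
        check=True,
        capture_output=True,
        text=True,
    )
except subprocess.CalledProcessError as exc:
    failure = exc
    # The captured stderr travels with the exception.
    print(f"Exit status {exc.returncode}: {exc.stderr.strip()}")
```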

Best Practices and Considerations

  • Portability: Be cautious when using system commands; they may not be portable across different operating systems.
  • Dependency Management: Relying on external commands can introduce dependencies that may not be present in all environments.
  • Security: Avoid using shell=True when possible. If you must use it, ensure that the command strings are not constructed from untrusted input.
  • Use Python Modules When Possible: Prefer built-in or external Python modules over system commands for better portability and maintainability.
  • Error Handling: Always handle exceptions such as TimeoutExpired and CalledProcessError to make your scripts robust.
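
Two of the points above can be sketched with the standard library alone: shutil.which() checks whether an external command is available before you rely on it, and pathlib lists a directory without calling ls. The command name below is deliberately made up so that the lookup fails.

```python
import shutil
from pathlib import Path

# shutil.which() returns the full path of a command, or None if it is missing.
missing = shutil.which("surely-not-an-installed-command-12345")
print("Found:", missing)

# Listing a directory with pathlib avoids depending on `ls` entirely.
entries = sorted(p.name for p in Path(".").iterdir())
print(entries)
```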